There are many different methods in the literature for explaining machine learning results, but they differ in their approaches and often do not provide the same explanations. In this paper, we consider two recent methods: Integrated Gradients (Sundararajan, Taly, and Yan, 2017) and Baseline Shapley (Sundararajan and Najmi, 2020). The original authors have studied the axiomatic properties of both methods and provided some comparisons. Our work offers additional insight into their comparative behavior for tabular data. We discuss common situations in which the two provide identical explanations and situations in which they differ. We also use simulation studies to examine the differences when the fitted model is a neural network with ReLU activation functions.
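As a point of reference (not quoted from the paper), the Integrated Gradients attribution for feature $i$ of an input $x$ relative to a baseline $x'$ is commonly defined as

\[
\mathrm{IG}_i(x; x') \;=\; (x_i - x'_i) \int_0^1 \frac{\partial F\bigl(x' + \alpha\,(x - x')\bigr)}{\partial x_i}\, d\alpha ,
\]

while Baseline Shapley applies the classical Shapley value with "absent" features set to their baseline values $x'_i$. Both methods therefore distribute the difference $F(x) - F(x')$ across the features, which is what makes their comparison natural.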
Low-order functional ANOVA (fANOVA) models have gained attention in the machine learning (ML) community under the guise of inherently interpretable machine learning. Explainable Boosting Machines, or EBM (Lou et al., 2013), and GAMI-Net (Yang et al., 2021) are two recently proposed ML algorithms for fitting functional main effects and second-order interactions. We propose a new algorithm, called GAMI-Tree, that is similar to EBM but has a number of features that lead to better performance. It uses model-based trees as base learners and incorporates a new interaction filtering method that is better at capturing the underlying interactions. In addition, our iterative training method converges to a model with better predictive performance, and the embedded purification ensures that interactions are hierarchically orthogonal to the main effects. The algorithm requires little tuning, and our implementation is fast and efficient. We use simulated and real datasets to compare the performance and interpretability of GAMI-Tree with EBM and GAMI-Net.
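For context, a low-order functional ANOVA model of the kind fitted by EBM, GAMI-Net, and GAMI-Tree restricts the predictor to main effects and second-order interactions (a generic sketch, not a formula taken from the paper):

\[
g\bigl(\mathbb{E}[y \mid \mathbf{x}]\bigr) \;=\; g_0 + \sum_j g_j(x_j) + \sum_{j<k} g_{jk}(x_j, x_k),
\]

where purification makes each interaction term $g_{jk}$ orthogonal to the corresponding main effects $g_j$ and $g_k$, so that the main effects retain their interpretation.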
Most machine learning (ML) algorithms have several stochastic elements, and their performance is affected by these sources of randomness. This paper uses an empirical study to systematically examine the effects of two such sources: randomness in model training and randomness in splitting a dataset into training and test subsets. We quantify and compare the magnitude of the resulting variation in predictive performance for the following ML algorithms: random forests (RFs), gradient boosting machines (GBMs), and feedforward neural networks (FFNNs). Among the algorithms, randomness in model training causes larger variation for FFNNs than for the tree-based methods. This is to be expected, as FFNNs have more stochastic elements as part of their model initialization and training. We also find that the random splitting of the dataset leads to greater variation than the inherent randomness of model training. Variation from data splitting can be a major concern if the original dataset has considerable heterogeneity. Keywords: model training, reproducibility, variation
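A minimal sketch of how the two randomness sources can be isolated, using scikit-learn; the dataset, model, and repetition counts are illustrative and not taken from the study:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# Source 1: randomness in model training (fixed data split, varying training seed).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
train_scores = [
    accuracy_score(y_te, RandomForestClassifier(random_state=s).fit(X_tr, y_tr).predict(X_te))
    for s in range(20)
]

# Source 2: randomness in the train/test split (fixed training seed, varying split seed).
split_scores = []
for s in range(20):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=s)
    model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
    split_scores.append(accuracy_score(y_te, model.predict(X_te)))

print("sd from training randomness:", np.std(train_scores))
print("sd from data-split randomness:", np.std(split_scores))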
We consider the problem of cost-optimal utilization of a crowdsourcing platform for binary, unsupervised classification of items, given a prescribed error threshold. Workers on the crowdsourcing platform are assumed to be divided into several classes based on their skills, experience, and/or past performance. We model each worker class by an unknown confusion matrix and a (known) price to be paid per label prediction. For this setting, we propose algorithms for acquiring label predictions from workers and for inferring the true labels of the items. We prove that if the number of (unlabeled) items available is large enough, our algorithms satisfy the prescribed error threshold while incurring a cost that is near-optimal. Finally, we validate our algorithms, and some heuristics inspired by them, through extensive case studies.
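As an illustration of the label-inference step only, here is a simplified sketch that assumes the class confusion matrices are already known (the paper treats them as unknown and must learn them); all numbers and names are made up:

import numpy as np

def posterior_true_label(labels, classes, confusion, prior=np.array([0.5, 0.5])):
    """Posterior P(true label = 1 | worker predictions) for one item.
    labels: list of 0/1 predictions; classes: worker class index per prediction;
    confusion[c][t, l] = P(worker of class c predicts l | true label is t)."""
    log_post = np.log(prior.astype(float))
    for lab, c in zip(labels, classes):
        log_post += np.log([confusion[c][0, lab], confusion[c][1, lab]])
    post = np.exp(log_post - log_post.max())
    return (post / post.sum())[1]

# Two worker classes with different reliabilities (illustrative numbers).
confusion = [np.array([[0.9, 0.1], [0.2, 0.8]]),   # skilled class
             np.array([[0.6, 0.4], [0.4, 0.6]])]   # noisy class
print(posterior_true_label(labels=[1, 1, 0], classes=[0, 1, 1], confusion=confusion))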
Rehabilitation exercises are crucial for ensuring the speedy recovery of stroke patients. An automated system is designed to repeatedly assist patients with their rehabilitation exercises. The design process is presented in this report.
We investigate partially observable Markov decision processes (POMDPs) with cost functions regularized by entropy terms describing state, observation, and control uncertainty. Standard POMDP techniques are shown to offer bounded-error solutions to these entropy-regularized POMDPs, with exact solutions when the regularization involves the joint entropy of the state, observation, and control trajectories. Our joint-entropy result is particularly surprising since it constitutes a novel formulation of active state estimation.
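To fix ideas, an entropy-regularized POMDP objective of the general kind described here can be sketched as follows; this is an illustrative form with a single regularization weight $\beta$, and the exact objective used in the paper may differ:

\[
\min_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T} c(x_t, u_t)\right] \;+\; \beta\, H\!\left(X_{0:T} \mid Y_{0:T}, U_{0:T}\right),
\]

where the conditional-entropy term penalizes uncertainty about the state trajectory given the observations and controls, which is what links this kind of regularization to active state estimation.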
While the brain connectivity network can inform the understanding and diagnosis of developmental dyslexia, its cause-effect relationships have not yet been examined sufficiently. Employing electroencephalography signals and a band-limited white noise stimulus at 4.8 Hz (the prosodic-syllabic frequency), we measure the phase Granger causalities among channels to identify differences between dyslexic learners and controls, thereby proposing a method to calculate directional connectivity. As causal relationships run in both directions, we explore three scenarios, namely channels' activity as sources, as sinks, and in total. Our proposed method can be used for both classification and exploratory analysis. In all scenarios, we find confirmation of the established right-lateralized Theta sampling network anomaly, in line with the temporal sampling framework's assumption of oscillatory differences in the Theta and Gamma bands. Further, we show that this anomaly primarily occurs in the causal relationships of channels acting as sinks, where it is significantly more pronounced than when only total activity is observed. In the sink scenario, our classifier obtains 0.84 and 0.88 accuracy and 0.87 and 0.93 AUC for the Theta and Gamma bands, respectively.
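For readers unfamiliar with the directional-connectivity measure, a standard (non-phase) Granger-causality test between two channels can be run as below; this is only an illustration with synthetic data, and the paper's phase-based variant and its EEG channel pairs are not reproduced here:

import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
source = rng.standard_normal(1000)                             # candidate "source" channel
sink = np.roll(source, 3) + 0.5 * rng.standard_normal(1000)    # "sink" channel driven by the source

# Column 0 is the series being predicted (sink); column 1 is the candidate cause (source).
data = np.column_stack([sink, source])
results = grangercausalitytests(data, maxlag=5)
print(results[3][0]["ssr_ftest"])  # (F statistic, p-value, df_denom, df_num) at lag 3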
Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based Neural Architecture Search (NAS) method. Since the introduction of DARTS, little work has been done on adapting the action space based on state-of-the-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-design changes inspired by ConvNeXt and studying the trade-off between accuracy, evaluation layer count, and computational cost. To this end, we introduce the Pseudo-Inverted Bottleneck conv block, intended to reduce the computational footprint of the inverted bottleneck block proposed in ConvNeXt. Our proposed architecture is much less sensitive to evaluation layer count and significantly outperforms a DARTS network of similar size at layer counts as small as 2. Furthermore, with fewer layers, not only does it achieve higher accuracy with lower GMACs and parameter count, but GradCAM comparisons also show that our network is able to better detect distinctive features of target objects than DARTS.
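For reference, a standard ConvNeXt-style inverted bottleneck (the structure the Pseudo-Inverted Bottleneck aims to slim down) looks roughly like the following PyTorch sketch; normalization details are simplified here, and the paper's exact block is not reproduced:

import torch
from torch import nn

class InvertedBottleneck(nn.Module):
    """ConvNeXt-style block: depthwise 7x7 conv, then a 4x pointwise expansion
    and a projection back to the input width. Simplified illustration only."""
    def __init__(self, dim: int, expansion: int = 4):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.BatchNorm2d(dim)  # ConvNeXt uses LayerNorm; BatchNorm keeps the sketch short
        self.pwconv1 = nn.Conv2d(dim, expansion * dim, kernel_size=1)
        self.act = nn.GELU()
        self.pwconv2 = nn.Conv2d(expansion * dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.dwconv(x)
        x = self.norm(x)
        x = self.pwconv1(x)
        x = self.act(x)
        x = self.pwconv2(x)
        return x + residual

block = InvertedBottleneck(dim=64)
print(block(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])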
Deep learning techniques with neural networks have been used effectively in computational fluid dynamics (CFD) to obtain solutions to nonlinear differential equations. This paper presents a physics-informed neural network (PINN) approach to solving the Blasius function. This method eliminates the need to convert the nonlinear differential equation into an initial value problem, and it also addresses the convergence issue that arises in the conventional series solution. The method is seen to produce results on par with the numerical and conventional methods. The solution is extended to the negative axis to show that PINNs capture the singularity of the function at $\eta = -5.69$.
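For completeness, the Blasius boundary-layer problem being solved is the third-order ODE (up to the normalization convention)

\[
2 f'''(\eta) + f(\eta)\, f''(\eta) = 0, \qquad f(0) = f'(0) = 0, \qquad f'(\eta) \to 1 \ \text{as}\ \eta \to \infty,
\]

and a PINN of this kind is trained to satisfy the residual and boundary conditions directly, rather than first converting the boundary-value problem into an initial value problem.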
We propose an ensemble approach to predict the labels in linear programming word problems. Entity identification and meaning representation are the two tasks to be solved in the NL4Opt competition. For the first task, we propose the ensembleCRF method to identify the named entities; in our analysis, we found that single models did not improve results for this task. A set of prediction models predicts the entities, and the generated results are combined into a consensus result by the ensembleCRF method. For the second task, we present an ensemble text generator to produce the representation sentences. Because the output would otherwise overflow, we divide the problem into multiple smaller tasks. A single model generates different representations based on the prompt, and all the generated text is combined to form an ensemble and produce the mathematical meaning of the linear programming problem.
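A minimal sketch of one way the consensus step for the entity task could work, taking a token-level majority vote over the tag sequences produced by several models; the tag names and model outputs below are illustrative and not taken from the competition data:

from collections import Counter

def ensemble_tags(predictions):
    """predictions: list of tag sequences (one per model) for the same sentence.
    Returns the per-token majority tag, breaking ties in favour of the earliest
    model whose tag attains the top count."""
    consensus = []
    for token_tags in zip(*predictions):
        counts = Counter(token_tags)
        top = max(counts.values())
        winner = next(tag for tag in token_tags if counts[tag] == top)
        consensus.append(winner)
    return consensus

model_outputs = [
    ["O", "B-LIMIT", "I-LIMIT", "O", "B-VAR"],
    ["O", "B-LIMIT", "O",       "O", "B-VAR"],
    ["O", "B-LIMIT", "I-LIMIT", "O", "O"],
]
print(ensemble_tags(model_outputs))  # ['O', 'B-LIMIT', 'I-LIMIT', 'O', 'B-VAR']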